Skip to content

MAINT: Refactor SkeletonKeyAttack to single-turn prepended conversati…#1739

Open
ksaravindakashyap wants to merge 1 commit into
microsoft:mainfrom
ksaravindakashyap:maint/refactor-skeleton-key-subclass
Open

MAINT: Refactor SkeletonKeyAttack to single-turn prepended conversati…#1739
ksaravindakashyap wants to merge 1 commit into
microsoft:mainfrom
ksaravindakashyap:maint/refactor-skeleton-key-subclass

Conversation

@ksaravindakashyap
Copy link
Copy Markdown

Title:

MAINT: Refactor SkeletonKeyAttack to single-turn prepended conversation pattern
Body:

Description

Refactors SkeletonKeyAttack to follow the same pattern as FlipAttack — using _setup_async to prepend a simulated skeleton key exchange (user prompt + assistant acceptance) as conversation history, then sending the objective in a single real API call via the parent PromptSendingAttack._perform_async.

Previously, the attack made two real API calls: one to send the skeleton key prompt and await the model's response, then a second for the objective. This replaces that custom two-turn flow with the standard prepended conversation mechanism already used across the codebase.

Related: #650

Changes:

  • _setup_async added to set context.prepended_conversation with [user: skeleton_key, assistant: acceptance]
  • SkeletonKeyAttack._perform_async override removed — parent handles execution
  • New skeleton_key_acceptance constructor parameter (optional, mirrors skeleton_key_prompt)
  • New default acceptance prompt file: pyrit/datasets/executors/skeleton_key/skeleton_key_acceptance.prompt
  • Removed _send_skeleton_key_prompt_async, _create_skeleton_key_failure_result, _load_skeleton_key_prompt
  • executed_turns in AttackResult now returns 1 instead of 2 to reflect the true single-turn nature

@romanlutz

Tests and Documentation

  • Unit tests in tests/unit/executor/attack/single_turn/test_skeleton_key.py rewritten to cover the new _setup_async behaviour (19 tests, all passing)
  • doc/code/executor/attack/skeleton_key_attack.py and .ipynb updated with revised description reflecting the single-turn approach; stale pre-run notebook outputs cleared
  • JupyText not re-executed (requires live API credentials)

@ksaravindakashyap
Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors SkeletonKeyAttack from a two-call flow to a single target call with a simulated skeleton-key exchange prepended as conversation history.

Changes:

  • Adds default simulated acceptance prompt and constructor override.
  • Moves skeleton-key setup into _setup_async and relies on parent execution.
  • Updates unit tests and synced documentation/notebook content.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
pyrit/executor/attack/single_turn/skeleton_key.py Implements prepended skeleton-key conversation setup.
pyrit/datasets/executors/skeleton_key/skeleton_key_acceptance.prompt Adds default simulated assistant acceptance text.
tests/unit/executor/attack/single_turn/test_skeleton_key.py Rewrites tests around setup and initialization behavior.
doc/code/executor/attack/skeleton_key_attack.py Updates documentation for the single-turn pattern.
doc/code/executor/attack/skeleton_key_attack.ipynb Syncs notebook text/code and clears stale outputs.
Comments suppressed due to low confidence (2)

pyrit/executor/attack/single_turn/skeleton_key.py:112

  • This setup drops the configured request converters when adding the skeleton-key prompt as prepended history. The parent setup passes request_converters=self._request_converters, and the previous implementation sent the skeleton-key prompt through _send_prompt_to_objective_target_async, so converters applied to it; after this change they apply only to the final objective turn. Pass the converter configuration through when initializing the prepended conversation, or otherwise preserve the previous converter behavior.
        await self._conversation_manager.initialize_context_async(
            context=context,
            target=self._objective_target,
            conversation_id=context.conversation_id,
            memory_labels=self._memory_labels,

tests/unit/executor/attack/single_turn/test_skeleton_key.py:247

  • The rewritten setup tests patch out initialize_context_async and only assert a few arguments, so they no longer cover that configured request converters are preserved when the skeleton-key prompt is moved into prepended history. Please add a test that exercises _setup_async with an AttackConverterConfig and verifies the converters are passed through/applied to the prepended skeleton-key message.
        mock_init = AsyncMock()
        with patch.object(attack._conversation_manager, "initialize_context_async", mock_init):
            await attack._setup_async(context=basic_context)

        mock_init.assert_called_once()
        call_kwargs = mock_init.call_args.kwargs
        assert call_kwargs["context"] is basic_context
        assert call_kwargs["target"] is mock_target
        assert call_kwargs["conversation_id"] == basic_context.conversation_id

Comment on lines +108 to +112
await self._conversation_manager.initialize_context_async(
context=context,
target=self._objective_target,
conversation_id=context.conversation_id,
objective=context.objective,
atomic_attack_identifier=build_atomic_attack_identifier(attack_identifier=self.get_identifier()),
last_response=None,
last_score=None,
outcome=AttackOutcome.FAILURE,
outcome_reason="Skeleton key prompt was filtered or failed",
executed_turns=1,
labels=context.memory_labels,
memory_labels=self._memory_labels,
Comment on lines +188 to +201
with patch.object(attack._conversation_manager, "initialize_context_async", new_callable=AsyncMock):
await attack._setup_async(context=basic_context)

# Check that skeleton key prompt was included in message
message = call_args.kwargs["message"]
assert isinstance(message, Message)
assert len(message.message_pieces) == 1
assert message.message_pieces[0].original_value == "Test skeleton key"
assert message.message_pieces[0].original_value_data_type == "text"
assert basic_context.conversation_id != original_id

async def test_send_skeleton_key_prompt_filtered_response(self, mock_target, mock_prompt_normalizer, basic_context):
"""Test handling of filtered skeleton key prompt response."""
async def test_setup_creates_two_prepended_messages(self, mock_target, basic_context):
attack = SkeletonKeyAttack(
objective_target=mock_target,
prompt_normalizer=mock_prompt_normalizer,
skeleton_key_prompt="Test skeleton key",
skeleton_key_prompt="sk prompt",
skeleton_key_acceptance="acceptance",
)

# Simulate filtered response
mock_prompt_normalizer.send_prompt_async.return_value = None

result = await attack._send_skeleton_key_prompt_async(context=basic_context)

assert result is None

async def test_send_skeleton_key_prompt_uses_correct_converters(
self, mock_target, mock_prompt_normalizer, basic_context
):
"""Test that skeleton key prompt sending uses correct converter configurations."""
from pyrit.prompt_normalizer.prompt_converter_configuration import (
PromptConverterConfiguration,
)
with patch.object(attack._conversation_manager, "initialize_context_async", new_callable=AsyncMock):
await attack._setup_async(context=basic_context)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants